DOCUMENT RESUME

ED 106 364                                              TM 004 482

AUTHOR          Impara, James C.
TITLE           Determining Assessment Content - Meeting Real Needs.
PUB DATE        [Apr 75]
NOTE            14p.; Paper presented at the Annual Meeting of the
                American Educational Research Association
                (Washington, D.C., March 30-April 3, 1975)

EDRS PRICE      MF-$0.76 HC-$1.58 PLUS POSTAGE
DESCRIPTORS     Credibility; *Educational Assessment; Educational
                Change; Evaluation; *Evaluation Criteria; Evaluation
                Methods; *Evaluation Needs; *Objectives; Program
                Costs; Relevance (Education); *State Programs
IDENTIFIERS     Oregon Assessment Program



ABSTRACT

The determination of assessment content is often made
on the basis of cost, political "clout", and relevance, in that
order. Three areas of assessment content are discussed: Broad areas,
specific areas to be measured, and non-test information. The broad
areas and non-test information are policy issues, while the
determination of specific outcomes is a more technical one. Several
criteria are suggested for aiding policy makers in determining broad
areas and the non-test information which is to be included. The issue
of determining specific outcomes may occur at the initial planning
stages of assessment or, as suggested by Dyer, following an assessment
"trial" run. In either case the specific outcomes should be
determined on the basis of the involvement of professional educators
and nonprofessional educators. Oregon's methodology for determining
specific outcomes for assessment is included. (Author/BJG)



DETERMINING ASSESSMENT CONTENT - MEETING REAL NEEDS






Presented by: 

James C. Impara, Director, Statewide Assessment 
Oregon State Department of Education 
Salem, Oregon



American Educational Research Association 
Annual Meeting 
March 30-April 3, 1975



In each of the fifty states some sort of statewide assessment program exists.
Each of these states has made a decision about what to assess. It might appear
that this paper is somewhat late in terms of providing alternative strategies for
determining assessment content. However, content areas to be assessed do change
and new strategies become necessary. This paper attempts to address the problem
of meeting real needs in determining assessment content; it is not directed toward
the mechanisms for determining content.

According to Womer, "The most logical starting point for any large scale
assessment is to determine its prime audiences and their most crucial needs for
student outcomes information. Such a focused program probably will be useful to
others outside the prime audience, whereas an unfocused program may not be of any
use to anyone." (Womer, p. 16) Clearly, in order to meet needs, it must be determined
whose needs will be met. The primary audience for assessment is usually either
teachers, local decision-makers, or state level decision-makers. On occasion an
attempt is made to make more than one of these groups a combined primary audience.
The difference between these alternatives usually involves the unit of analysis
and the nature of the reports, although some content differences may also be
indicated.

Assessment programs have begun with little thought to either the purposes of
the assessment or to its audiences. In some cases assessment is begun by legislative
mandate. In other cases it is the decision of the Chief State School Officer
or the State Board of Education. Rarely is it initiated by local school districts
or by teachers. In fact, this author is aware of no instance where the latter
group has been the initiator of statewide assessment. Someone, however, must make
the decision about the primary focus and purpose of statewide assessment. This
decision requires that those who set assessment policy know what type of information
they want or need. This can be a problem when those in charge of assessment policy
do not represent the audience for whom assessment data are intended.

It is not feasible to differentiate the purposes of assessment from the audiences
to be served by the assessment. If the policy makers for assessment are not representative
of the prime audience, then some mechanism must exist for enlightening the
policy makers as to the needs of the audience. Whether or not the primary audience
is represented in making assessment policy, there are two crucial aspects which
impinge on the potential or actual utility of the data for meeting real needs. These
two aspects are (1) trust and (2) involvement. These two conditions are "necessary
but not sufficient," i.e., they do not guarantee that assessment data will be used,
but without these conditions there is a greater likelihood that any assessment program
will become a "shelf" program. (A "shelf" program is one which serves no useful
purpose other than generating reports, print-outs, etc., that collect dust upon a
shelf.)

Although most assessment programs have begun with reading or math, some have
not followed this trend. Even so, cognitive outcomes are most commonly the primary
focus. A few states have implemented assessment in the affective domain, while
even fewer have assessed psychomotor skills. The content of assessment can be considered
to include several dimensions:

(1) The broad areas upon which assessment is focused, such as reading, math, 
citizenship, or self concept. 

(2) The specific outcomes within these areas. 

(3) The non-test information used to help explain or to categorize assessment results,
such as sex of student, parents' educational level, school size, or expenditures
per pupil.

Determining the broad areas and the non-test information are clearly policy
matters, while selecting the specific outcomes is a more technical issue. Both of
the policy concerns can be (and have been) resolved in various ways. The decision
on the broad areas is often made by legislation, as in Florida, or by the Chief
State School Officer or by an "Advisory Committee" as in Oregon. The decision is
often based upon input received through "happenings" such as town hall meetings,
Delphi techniques, public opinion polls, or discussions (formal or informal) with
key legislators or key school district superintendents. Regardless of how the
decision is made, a number of factors are usually considered:

(1) The estimated costs of the program, in particular the costs required for 
acquiring or developing the test as a function of content. 

(2) The political impact of the broad area to be assessed. That is, how much 
"clout" will the published results have in bringing about needed instructional 
improvement or redistribution of resources. 

(3) The relevance of the results to the primary assessment audience. (In some
cases "clout" and relevance are the same depending on the area and the audience.)

It is not unusual for these three considerations to occur in the order in which
they are listed. Thus, in considering assessment content the relevance of the
results is often not the foremost concern. Under these conditions, how then can
we determine assessment content which will meet real needs?

If the primary audience is to be the classroom teacher, then statewide assessment
probably cannot do an effective job of meeting their needs. In fact, this
author is becoming more and more convinced that this is a correct point of view.
That is not to say that classroom teachers cannot be users of assessment data. It
is to say that teachers should not be the primary assessment audience. Their needs
may best be met by local school district assessment or testing programs. The rationale
for this statement is that the needs of teachers are simply too diverse to be met
by a statewide assessment program. If not teachers, who then should be the primary
audience? I believe that the focus of assessment should be state-level decision-makers.

No matter which state-level audience is determined to be the primary one, the
results will still have an impact (if these results are reported in a useful way)
on the operation of local schools. In some instances the impact may be very direct
and immediate; in other instances, more indirect and over a longer term. What then
are the implications of this position for determining assessment content? One
clear implication is that whatever is measured should be almost universally accepted
as being important, or should be "visible" enough to justify the high cost of assessment.
Another implication is that the credibility of the results must be high, i.e.,
a high level of trust must exist among the assessment program, the primary audience,
and those who will be affected by the decisions or actions of this primary audience.
While these implications may seem trivial or at least obvious, they present
some interesting dilemmas for deciding on both the general areas to be assessed and
the specific content. These dilemmas can only be resolved when the specific
purposes of assessment are known and agreed upon by the policy makers.

Depending on the nature of the intended decisions or actions, other implications
are also apparent. They relate to issues such as the type of test (mastery or
differential), the data collection methodology (sampling or census), and the type of
report (interpretive or not interpretive). These issues are discussed elsewhere and
are not considered in this presentation.

It is generally agreed that each of the dimensions of assessment content
(broad areas, specific outcomes to be measured, and non-test information) should
be determined at the early stages of assessment planning prior to any testing.
If we assume that statewide assessment is not a one-shot affair, but will continue
on a long term basis, then we must further assume that the specific objectives of
assessment will change periodically. But will the broad areas change over time,
and will the non-test data change over time? Probably not to any large extent,
although areas may be added and additional non-test information may be requested
(the general trend in data collection is not to delete or purge information; rather,
it is to ask for more information). Thus, the issues of which broad areas will be
of most relevance to the policy-makers and which non-test information will provide
this audience with useful information for decision making must be considered early.
While costs and "clout" are important factors, the relevance of the assessment
results should not be overlooked in making decisions on broad areas and non-test
information.

The following criteria are suggested for making the determination of the broad 
areas to be assessed when given student performance information: 

(1) Will it be sufficient to meet the "public's right-to-know" about the status and
condition of pupil performance in the public (and/or nonpublic) schools?

(2) Is there an action or set of actions which can be taken to bring about needed 
change? 

The change may be in relation to resource allocation, or to a different emphasis on 
instruction in the area. 

The non-test information should also be examined against these criteria. There
are many items of non-test information which can be collected. Some of these items
are correlated with pupil achievement. However, is it feasible to make changes based
upon this information? Sometimes the answer may be yes even though the policy-makers
have no control over the condition or classification. For example, policy-makers
cannot control a student's race, but they may control resources which can be differentially
allocated to students who are members of a particular race, or to schools
with a high concentration of students of a particular race.






These criteria in themselves do not suggest what assessment content ought to
be. They represent instead a test for insuring that needs can be met. When deciding
whether assessment should focus on cognitive outcomes, psychomotor outcomes, or
affective outcomes, these criteria become quite relevant. After a preliminary
decision has been made, several questions must be answered by the policy-makers.
These questions include:

(1) If change is needed, how long will it take for change to occur? 

(2) If we know how to bring about change, can we afford the costs of doing so, or
if we do not know how to bring about change, can we afford the cost of finding
out?

(3) Does the public desire such change? 

While these three questions are important, they may on occasion be ignored if the need
to meet the public's right-to-know is sufficiently strong.

By what methods should we determine which broad areas to test using these criteria?
If the areas are not mandated (which is a frequent occurrence), then the assessment
staff must undertake to prepare a case for each of the alternatives which can be
afforded (assuming that budget limitations of the assessment program are specified).
This also holds true for determining the non-test information which is to be collected.
If it is not mandated, then there are several resources one can use in formulating
various cases for the non-test information (e.g., Bryant et al., 1974; SUNY, 1972).

How does one determine the specific outcomes which are to be measured in assessment?
It is this area which requires the maximum of involvement of those who will
be affected by the assessment. The specific outcomes to be measured should be determined
both by professional educators and by nonprofessional educators. (A nonprofessional
educator is anyone who is not employed in the field of education. This
group is often called the "general public" or the "lay community.") Womer suggests
that the determination of the specific outcomes should be accomplished in the early
planning stages of assessment. He goes on to say that "...subject matter objectives
are not (should not be) the sole property of subject matter specialists." (Womer, p. 56)
This author agrees with that preference since even though the teacher is vitally
concerned with what should be learned, so are the student, the parent, board of
education members, and other taxpayers. The process of surveying teachers and others
is a long and arduous one. To obtain meaningful input requires a great deal of time
and effort on the part of all of those who are involved in the assessment. Those
individuals include not only the staff at the state level, but also those in local
districts who are assisting in determining and selecting objectives.

While Womer suggested that the determination of the objectives to be assessed
should be accomplished in the planning stages, Dyer, on the other hand, offers an
alternative approach. He suggests that a case can be made "for the proposition
that you are more likely to get useful agreement on what the outcomes of education
ought to be after you have made some assessment of what the outcomes of education
actually are." (Dyer, p. 23) Unfortunately, Dyer does not provide a methodology
for undertaking this alternative. The argument in favor of this approach does have
possibilities since in the initial assessment of an area it may be extremely difficult
to establish a proper mental "set" for determining the outcomes of schooling. It
is difficult sometimes to predict the consequences of an assessment, and therefore it
may be highly desirable to modify what is measured after some experience in interpretation
and utilization of results has occurred. This may not require a change in
the procedures for determining assessment objectives in the initial stages of assessment.
It does require, however, that a high level of trust exists, so that necessary
changes can be made once the assessment cycle has begun.






SUMMARY 

The determination of assessment content is often made on the basis of costs, 
political "clout", and relevance, in that order. Three areas of assessment content 
are discussed: Broad areas, specific areas to be measured, and non-test information. 
The broad areas and non-test information are policy issues, while the determination 
of specific outcomes is a more technical one. Several criteria are suggested for 
aiding policy makers in determining broad areas and the non-test information which 
is to be included. The issue of determining specific outcomes may occur at the
initial planning stages of assessment or, as suggested by Dyer, following an
assessment "trial" run. In either case the specific outcomes should be determined on the
basis of the involvement of professional educators and nonprofessional educators. 






OREGON'S METHODOLOGY FOR DETERMINING SPECIFIC OUTCOMES FOR ASSESSMENT 



The selection of performance indicators (which in another context might be
called objectives) includes several components. The initial step is to form a
technical advisory committee which consists of classroom teachers and subject
matter specialists who have a reputation in the State for being highly qualified
in the subject area to be tested. This group is brought together to accomplish
several purposes:

(1) To define in general terms the subject matter to be assessed. 

(2) To set up appropriate subdivisions (domains) within the scope of the definition.
For example in mathematics these broad areas might include geometry, arithmetic, 
measurement, problem solving, and probability and statistics. 

(3) To recommend existing sources of performance indicators. These sources often 
include the Instructional Objectives Exchange, National Assessment, and local 
education agency indicators. 

(4) To review the collection of performance indicators to insure that they meet a 
pre-established set of criteria. These criteria include statements pertaining 
to relevance, appropriateness for grade level, appropriateness to domain, and 
clarity of content (understandability).

(5) To make a preliminary selection of those performance indicators which, in the
review panel's opinion, are the most important and essential for student
success in everyday living.

(6) To make the final decision on any performance indicators which have a "tie 
score" as a result of the field selection process. 

The second component involves the distribution of a preliminary set of performance
indicators to both professional and nonprofessional educators around the
state. This step is accomplished subsequent to item 5 above. These performance
indicators are distributed to "assessment coordinators" employed by each county
(either county unit or intermediate education district) who are assigned by the
superintendent certain responsibilities in the area of statewide assessment.
Among these responsibilities is that of assisting in the selection of the performance
indicators to be measured in statewide assessment.

The coordinators are asked to have the performance indicators reviewed by
local individuals who are professional or nonprofessional educators. The coordinators
then obtain, by whatever means is most convenient, a consensus opinion
from the participants on the indicators which should be measured in the statewide
assessment program.

The number of indicators to select is given to prevent all those indicators
sent out from being selected. This number is usually between one-fourth and one-half
of the total sent out. The determination of the number of performance indicators
is based upon an estimate of 5-6 items per indicator and a desired testing time
appropriate to the grade level which is to be assessed. Thus, if fourth grade was
to be assessed using a 90 minute (two session) test, and if the average time per
item was estimated at one minute, then 15 performance indicators would be selected.
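
The arithmetic behind the fourth grade example can be sketched as follows. This is only an illustration of the estimate described above; the function name and parameter names are hypothetical and are not part of the Oregon procedure. The figure of 15 indicators corresponds to the upper end of the 5-6 item estimate (six items per indicator); at five items per indicator the same test would carry 18.

```python
def indicators_for_test(testing_minutes, minutes_per_item, items_per_indicator):
    """Estimate how many performance indicators fit in the available testing time.

    Hypothetical helper: total items = testing time / time per item,
    and each indicator is assumed to need a fixed number of items.
    """
    total_items = testing_minutes // minutes_per_item
    return total_items // items_per_indicator


# Fourth grade example from the text: 90 minute test, one minute per item.
print(indicators_for_test(90, 1, 6))  # six items per indicator
print(indicators_for_test(90, 1, 5))  # five items per indicator
```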

The criteria the state will use for determining if an indicator is selected
are given at the same time as the indicators are distributed to the local assessment
coordinator. These criteria are usually a combination of various weighting schemes.
For example, each unit is awarded a weight of one for one portion of the scale,
and for another portion is weighted according to the proportion of the state's
children which are included in that unit. If a unit has 10% of the State's children,
it would be awarded a weight of .5. In this example each performance indicator
selected must have had an absolute score equal to at least one-half of the number of
units responding and it must have had a population weighted score in excess of 50.
Thus units which have only a small percentage of the State's population cannot band
together and force an indicator to be selected, while the units which have a large
portion of the State's population cannot band together to insure the inclusion of a
particular indicator. Each indicator must be selected by a large number of
units (at least half), and these units must represent a large portion of the state's
population (at least half).
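
A minimal sketch of this two-part rule is given below. It is an illustration only, not the state's actual scoring procedure. It assumes that the population weight of a unit is expressed as its percentage of the state's children, so that "in excess of 50" means more than half of the state; the function name, the unit names, and the percentages are all hypothetical.

```python
def is_selected(votes, population_share):
    """Apply the two-part selection rule to one performance indicator.

    votes: dict mapping each responding unit to True if it chose the indicator.
    population_share: dict mapping each unit to its percent of the state's
        children (assumed scale: whole percentage points).
    """
    choosing = [unit for unit, chose in votes.items() if chose]
    absolute_score = len(choosing)  # each unit counts once
    weighted_score = sum(population_share[unit] for unit in choosing)
    enough_units = absolute_score >= len(votes) / 2
    enough_population = weighted_score > 50
    return enough_units and enough_population


# Hypothetical state with two large units and three small ones.
share = {"A": 40, "B": 30, "C": 10, "D": 10, "E": 10}

small_only = {"A": False, "B": False, "C": True, "D": True, "E": True}
large_only = {"A": True, "B": True, "C": False, "D": False, "E": False}
mixed = {"A": True, "B": False, "C": True, "D": True, "E": False}

print(is_selected(small_only, share))  # many units, too little population
print(is_selected(large_only, share))  # much population, too few units
print(is_selected(mixed, share))       # passes both parts of the rule
```

In the first two cases the indicator fails one part of the rule, which is the safeguard the paragraph above describes: neither the small units nor the large units can carry an indicator alone.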

The final component involves tabulating the consensus from each unit and, if
necessary, having the advisory panel resolve tie scores. The tie breaking process
is only undertaken if the maximum number of performance indicators desired would be
exceeded by including all of the indicators for which ties have occurred.
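
The condition under which the advisory panel is needed can be sketched as follows. The sketch assumes, purely for illustration, that each qualifying indicator carries a single numeric score and that the highest scoring indicators are kept up to the desired maximum; the paper does not specify the ranking in this detail, and the function name and indicator labels are hypothetical.

```python
def split_by_ties(scores, max_indicators):
    """Separate indicators that are accepted outright from those the panel must resolve.

    scores: dict mapping indicator label to its score (assumed single number).
    Returns (accepted, needs_panel). The panel is involved only when a tie
    at the cutoff would push the total past max_indicators.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    if len(ranked) <= max_indicators:
        return ranked, []
    cutoff = scores[ranked[max_indicators - 1]]
    above = [name for name in ranked if scores[name] > cutoff]
    tied = [name for name in ranked if scores[name] == cutoff]
    if len(above) + len(tied) <= max_indicators:
        return above + tied, []  # the tie fits, so no tie breaking is needed
    return above, tied  # panel chooses among the tied indicators


scores = {"R1": 9, "R2": 8, "R3": 8, "R4": 5}
print(split_by_ties(scores, 3))  # tie between R2 and R3 fits within the maximum
print(split_by_ties(scores, 2))  # tie would exceed the maximum, so the panel decides
```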

Thus far it has not been necessary to develop from scratch any performance
indicators for the Oregon Assessment program. This is because reading and math are
the first areas to be assessed, and upon examination of existing performance indicators,
there has been little need to develop new performance indicators. This
condition is not expected to continue since the next area for assessment is citizenship.
The advisory panel for this area is now being formed, and there is an expectation
that development of some performance indicators will be necessary.






REFERENCES 



Bryant, Edward C.; Glaser, Ezra; Hanson, Morris; and Kirsh, Arthur. Associations
Between Educational Outcomes and Background Variables: A Review of Selected Literature.
NAEP. 1974.

Dyer, Henry. "The State of the State Assessment Art." Speaking Out; Law, Education
and Politics. Proceedings of the Invitational Conference on Educational Assessment
and Educational Policy. ETS. 1974.

Variables Related to Student Performance and Resource Allocation Decisions at
the School District Level. The University of the State of New York, State Education
Department, Bureau of School Programs Evaluation. 1973.

Womer, Frank. Developing a Large Scale Assessment Program. Cooperative Accountability
Project, Denver, Colorado. 1973.






